Preface


This tutorial introduces the libelf library being developed at the ElfToolChain 
project on SourceForge.Net. It shows how this library can be used to create tools 
that can manipulate ELF objects for native and non-native architectures. 

The ELF(3)/GELF(3) APIs are discussed, as is handling of ar(1) archives.
The ELF format is discussed to the extent needed to understand the use of the
ELF(3) library.

Knowledge of the C programming language is a pre-requisite. 


Legal Notice 

Copyright © 2006-2010 Joseph Koshy. All rights reserved. 

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

• Redistributions of source code must retain the above copyright notice, this 
list of conditions and the following disclaimer. 

• Redistributions in binary form must reproduce the above copyright notice, 
this list of conditions and the following disclaimer in the documentation 
and/or other materials provided with the distribution. 


Disclaimer 

THIS DOCUMENTATION IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS
“AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
AUTHOR AND CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF
THE POSSIBILITY OF SUCH DAMAGE.

Chapter 1

Introduction 


ELF stands for Executable and Linkable Format (early documents also expand
it as Extensible Linking Format). It is a format for use by compilers,
linkers, loaders and other tools that manipulate object code.

The ELF specification was released to the public in 1990 as an “open
standard” by a group of vendors. As a result of its ready availability it has
been widely adopted by industry and the open-source community. The ELF
standard supports 32- and 64-bit architectures of both big- and little-endian
kinds, and supports features like cross-compilation and dynamic shared
libraries. ELF also supports the special compilation needs of the C++
language. Among the current set of open-source operating systems, the first
ELF-based release of NetBSD™ was for the DEC Alpha™ architecture, in release
1.3 (January 1998). FreeBSD™ switched to using ELF as its object format in
FreeBSD 3.0 (October 1998).

The libelf library provides an API set (ELF(3) and GELF(3)) for application
writers to read and write ELF objects with. The library eases the task
of writing cross-tools that can run on one machine architecture and manipulate 
ELF objects for another. 

There are multiple implementations of the ELF(3)/GELF(3) APIs in the 
open-source world. This tutorial is based on the libelf library being developed 
as part of the elftoolchain project on SourceForge.Net. 


Rationale for this tutorial 

The ELF(3) and GELF(3) API set is large, with over 80 callable functions. So 
the task of getting started with the library can appear daunting at first glance. 
This tutorial has been written to provide a gentle introduction to the API set. 


Target Audience 

This tutorial would be of interest to developers wanting to create ELF processing
tools using the libelf library.


1.1 Tutorial Overview 

The tutorial covers the following: 

• The basics of the ELF format (as much as is needed to understand how
to use the API set); how the ELF format structures the contents of
executables, relocatables and shared objects.

• How to get started building applications that use the libelf library. 

• The basic abstractions offered by the ELF(3) and GELF(3) APIs—how 
the ELF library abstracts out the ELF class and endianness of ELF objects 
and allows an application to work with native forms of these objects, while 
the library translates to and from the desired target representation behind 
the scenes. 

• How to use the APIs in the library to look inside an ELF object and 
examine its executable header, program header table and its component 
sections. 

• How to create a new ELF object using the ELF library. 

• An introduction to the class-independent GELF(3) interfaces, and when 
and where to use them instead of the class-dependent functions in the 
ELF(3) API set. 

• How to process ar archives using the facilities provided by the library. 

1.2 Tutorial Structure 

One of the goals of this tutorial is to illustrate how to write programs using 
libelf. So we will jump into writing code at the earliest opportunity. As we 
progress through the examples, we introduce the concepts necessary to
understand what is happening “behind the scenes.”

Chapter 2 covers the basics involved in getting started with the ELF(3)
library: how to compile and link an application that uses libelf. We
look at the way a working ELF version number is established by an application,
how a handle to an ELF object is obtained, and how error messages from the
ELF library are reported. The functions used in this chapter include elf_begin,
elf_end, elf_errmsg, elf_errno, elf_kind and elf_version.

Chapter 3 shows how an application can look inside an ELF object and
understand its basic structure. Along the way we will examine the way ELF
objects are laid out. Other key concepts covered are the notions of “file
representation” and “memory representation” of ELF data types. New APIs
covered include elf_getident, elf_getphdrnum, elf_getshdrnum,
elf_getshdrstrndx, gelf_getehdr and gelf_getclass.

Chapter 4 describes the ELF program header table and shows how an
application can retrieve this table from an ELF object. This chapter
introduces the gelf_getphdr function.

Chapter 5 then looks at how data is stored in ELF sections. A program that
looks at ELF sections is examined. The Elf_Scn and Elf_Data data types used
by the library are introduced. The functions covered in this chapter include
elf_getscn, elf_getdata, elf_nextscn, elf_strptr, and gelf_getshdr.

Chapter 6 looks at how we create ELF objects. We cover the rules for
ordering the individual API calls when creating ELF objects. We look at the
library’s object layout rules and how an application can choose to override
these. The APIs covered include elf_fill, elf32_getshdr, elf32_newehdr,
elf32_newphdr, elf_flagphdr, elf_ndxscn, elf_newdata, elf_newscn, and
elf_update.

The libelf library also assists applications that need to read ar archives.
Chapter 7 covers how to use the ELF(3) library to handle ar archives.
This chapter covers the use of the elf_getarhdr, elf_getarsym, elf_next and
elf_rand functions.

Chapter 8 ends the tutorial with suggestions for further reading.

Chapter 2

Getting Started 


Let us dive in and get a taste of programming with libelf. 


2.1 Example: Getting started with libelf 

Our first program (Program 1, listing 2.1) will open the file named on its
command line and retrieve the file type as recognized by the ELF library.

This example covers the basics involved in using libelf: how to compile
a program using libelf, how to initialize the library, how to report errors, and
how to wind up.


Listing 2.1: Program 1

    #include <err.h>
    #include <fcntl.h>
    #include <libelf.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sysexits.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
        int fd;
        Elf *e;
        char *k;
        Elf_Kind ek;

        if (argc != 2)
            errx(EX_USAGE, "usage: %s file-name", argv[0]);

        /* Establish the working ELF version before any other call. */
        if (elf_version(EV_CURRENT) == EV_NONE)
            errx(EX_SOFTWARE, "ELF library initialization "
                "failed: %s", elf_errmsg(-1));

        if ((fd = open(argv[1], O_RDONLY, 0)) < 0)
            err(EX_NOINPUT, "open \"%s\" failed", argv[1]);

        /* Convert the open file descriptor into an Elf handle. */
        if ((e = elf_begin(fd, ELF_C_READ, NULL)) == NULL)
            errx(EX_SOFTWARE, "elf_begin() failed: %s.",
                elf_errmsg(-1));

        ek = elf_kind(e);

        switch (ek) {
        case ELF_K_AR:
            k = "ar(1) archive";
            break;
        case ELF_K_ELF:
            k = "elf object";
            break;
        case ELF_K_NONE:
            k = "data";
            break;
        default:
            k = "unrecognized";
        }

        (void) printf("%s: %s\n", argv[1], k);

        /* Release the handle and its resources. */
        (void) elf_end(e);
        (void) close(fd);

        exit(EX_OK);
    }

The functions and datatypes that make up the ELF(3) API are declared in
the header libelf.h. This file must be included in every application that
uses the libelf library.

The ELF(3) library uses an opaque type Elf as a handle for the ELF object
being processed.

Before the functions in the library can be invoked, an application must
indicate to the library the version of the ELF specification it is expecting
to use. This is done by the call to elf_version.

A call to elf_version is mandatory before other functions in the ELF
library can be invoked.

There are multiple version numbers that come into play when an
application is manipulating an ELF object.

• First, there is the version of the ELF specification (v1) that the
application understands.

• Second, we have the ELF version associated with the ELF object
being processed (v2).

• Third, we have the versions recognized by the libelf library: v1 and
v2. The library may know how to translate between versions v1 and
v2.

Figure 2.1: ELF versions (the application works with version v1, the ELF
object conforms to version v2, and the ELF library understands both)

In figure 2.1 the application expects to work with ELF specification version
v1. The ELF object file conforms to ELF specification version v2. The
library understands both versions v1 and v2 of ELF semantics and so is
able to mediate between the application and the ELF object.

In practice, the ELF version has not changed since inception, so the
current version (EV_CURRENT) is 1.


The elf_begin function takes an open file descriptor and converts it to an Elf
handle according to the command specified.

The second parameter to elf_begin can be one of ‘ELF_C_READ’ for opening
an ELF object for reading, ‘ELF_C_WRITE’ for creating a new ELF object,
or ‘ELF_C_RDWR’ for opening an ELF object for updates. The mode with
which file descriptor fd was opened must be consistent with this
parameter.
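The consistency rule between the elf_begin command and the open(2) mode can be sketched as a small lookup. This is only an illustration: the enum open_cmd and the function open_flags_for are hypothetical names that stand in for the library's ELF_C_READ, ELF_C_WRITE and ELF_C_RDWR commands, and the choice of O_CREAT for the write case is an assumption about how a new object would typically be created.

```c
#include <fcntl.h>

/* Hypothetical stand-ins for ELF_C_READ, ELF_C_WRITE and ELF_C_RDWR. */
enum open_cmd { CMD_READ, CMD_WRITE, CMD_RDWR };

/*
 * Return open(2) flags that are consistent with the given command,
 * or -1 if the command is not one of the three above.
 */
int
open_flags_for(enum open_cmd cmd)
{
    switch (cmd) {
    case CMD_READ:
        return (O_RDONLY);              /* reading an existing object */
    case CMD_WRITE:
        return (O_WRONLY | O_CREAT);    /* creating a new object (assumed) */
    case CMD_RDWR:
        return (O_RDWR);                /* updating an object in place */
    }
    return (-1);
}
```

An application would pass the returned flags to open(2) and then hand the resulting descriptor to elf_begin with the matching command.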


The third parameter to elf_begin is only used when processing ar
archives. We will look at ar archive processing in chapter 7.

When the ELF library encounters an error, it records an error number in an
internal location. This error number may be retrieved using the elf_errno
function.

The elf_errmsg function returns a human readable string describing the
error number passed in. As a programming convenience, a value of -1
denotes the current error number.

The ELF library can operate on ar archives and ELF objects. The
function elf_kind returns the kind of object associated with an Elf
handle. The return value of the elf_kind function is one of the values defined
by the Elf_Kind enumeration in libelf.h.

When you are done with a handle, it is good practice to release its resources
using the elf_end function.


Now it is time to get something running. 

Save the listing in listing 2.1 to file prog1.c and then compile
and run it as shown in listing 2.2.

Listing 2.2: Compiling and running prog1

    % cc -o prog1 prog1.c -lelf
    % ./prog1 prog1
    prog1: elf object
    % ./prog1 /usr/lib/libc.a
    /usr/lib/libc.a: ar(1) archive


The -lelf option tells the cc command to link prog1 against the
libelf library.

We invoke prog1 on itself, and it recognizes its own executable as an ELF
object. All is well.

In the second invocation, prog1 recognizes an ar archive correctly.


Congratulations! You have created your first ELF handling program using 
libelf. 

In the next chapter we will look deeper into the ELF format and learn how 
to pick an ELF object apart into its component pieces. 


Chapter 3 

Peering Inside an ELF Object


Next, we will look inside an ELF object. We will look at how an ELF object 
is laid out and we will introduce its major parts, namely the ELF executable 
header, the ELF program header table and ELF sections. Along the way we 
will look at the way libelf handles non-native objects. 

3.1 The Layout of an ELF file 

As an object format, ELF supports multiple kinds of objects: 

• Compilers generate relocatable objects that contain fragments of machine 
code along with the “glue” information needed when combining multiple 
such objects to form a final executable. 

• Executables are programs that are in a form that an operating system can 
launch in a process. The process of forming executables from collections 
of relocatable objects is called linking. 

• Dynamically loadable objects are those that can be loaded by an executable 
after it has started executing. Dynamically loadable shared libraries are 
examples of such objects. 

An ELF object consists of a mandatory header named the ELF executable
header, followed by optional content in the form of an ELF program header
table and zero or more ELF sections (see figure 3.1).

• The ELF executable header defines the structure of the rest of the file. 
This header is always present in a valid ELF file. It describes the class 
of the file (whether 32 bit or 64 bit), the type (whether a relocatable, 
executable or shared object), and the byte ordering used (little endian or 
big endian). It also describes the overall layout of the ELF object. The 
ELF header is described below. 

• An optional ELF program header table is present in executable objects
and contains information used at program load time. The program
header table is described in chapter 4.

Figure 3.1: The layout of a typical ELF file (in file order: the ELF
executable header (EHDR), the ELF program header table (PHDR), the ELF
section contents (SECTION DATA) and the ELF section header table (SHDR))


• The contents of a relocatable ELF object are contained in ELF sections. 
These sections are described by entries in an ELF section header table. 
This table has one entry per section present in the file. Chapter 5
describes ELF sections and the section header table in further
detail.

Every ELF object is associated with three parameters: 

• Its class denotes whether it is a 32 bit ELF object (ELFCLASS32) or a 64 
bit (ELFCLASS64) one. 

• Its endianness denotes whether it is using little-endian (ELFDATA2LSB) or 
big-endian addressing (ELFDATA2MSB). 

• Finally, each ELF object is associated with a version number as discussed
in chapter 2.

These parameters are stored in the ELF executable header. Let us now take 
a closer look at the ELF executable header. 


The ELF Executable Header 


Table 3.1 describes the layout of an ELF executable header
using a “C-like” notation.


The first 16 bytes (the e_ident array) contain values that determine the
ELF class, version and endianness of the rest of the file. See figure 3.2.

Figure 3.2: The e_ident array (the magic bytes are followed by the bytes
holding the class, the endianness and the ELF version)

Table 3.1: The ELF Executable Header

32 bit Executable Header              64 bit Executable Header

typedef struct {                      typedef struct {
    unsigned char e_ident[16];            unsigned char e_ident[16];
    uint16_t      e_type;                 uint16_t      e_type;
    uint16_t      e_machine;              uint16_t      e_machine;
    uint32_t      e_version;              uint32_t      e_version;
    uint32_t      e_entry;                uint64_t      e_entry;
    uint32_t      e_phoff;                uint64_t      e_phoff;
    uint32_t      e_shoff;                uint64_t      e_shoff;
    uint32_t      e_flags;                uint32_t      e_flags;
    uint16_t      e_ehsize;               uint16_t      e_ehsize;
    uint16_t      e_phentsize;            uint16_t      e_phentsize;
    uint16_t      e_phnum;                uint16_t      e_phnum;
    uint16_t      e_shentsize;            uint16_t      e_shentsize;
    uint16_t      e_shnum;                uint16_t      e_shnum;
    uint16_t      e_shstrndx;             uint16_t      e_shstrndx;
} Elf32_Ehdr;                         } Elf64_Ehdr;


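Figure 3.3: The ELF Executable Header and Object Layout (the Ehdr occupies
e_ehsize bytes at the start of the file; the Phdr table starts at e_phoff
and spans the number of program headers times e_phentsize bytes; the Shdr
table starts at e_shoff and spans the number of section headers times
e_shentsize bytes)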

The first 4 bytes of an ELF object are always 0x7F, ‘E’, ‘L’ and ‘F’. 
The next three bytes specify the class of the ELF object (ELFCLASS32 or 
ELFCLASS64), its data ordering (ELFDATA2LSB or ELFDATA2MSB) and the 
ELF version the object conforms to. With this information on hand, the 
libelf library can then interpret the rest of the ELF executable header 
correctly. 
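To make the role of these leading bytes concrete, here is a minimal sketch that decodes them from a raw buffer without using libelf. The names decode_ident and struct ident_info are made up for this illustration; the index and value constants mirror EI_CLASS, EI_DATA, EI_VERSION, ELFCLASS32/64 and ELFDATA2LSB/MSB from the ELF specification. Real applications would normally let elf_getident and gelf_getclass do this work.

```c
#include <stddef.h>
#include <string.h>

/* Indices into e_ident and the values stored there, per the ELF specification. */
enum { IDX_CLASS = 4, IDX_DATA = 5, IDX_VERSION = 6, IDENT_SIZE = 16 };
enum { CLASS_32 = 1, CLASS_64 = 2 };    /* ELFCLASS32, ELFCLASS64 */
enum { DATA_LSB = 1, DATA_MSB = 2 };    /* ELFDATA2LSB, ELFDATA2MSB */

/* Hypothetical result type for this sketch. */
struct ident_info {
    int bits;           /* 32 or 64 */
    int big_endian;     /* 0 = little-endian, 1 = big-endian */
    int version;        /* the ELF version byte */
};

/*
 * Decode the identification bytes at the start of an ELF object.
 * Returns 0 on success, or -1 if the buffer is too short, lacks the
 * magic number, or holds an unknown class or data encoding.
 */
int
decode_ident(const unsigned char *buf, size_t len, struct ident_info *out)
{
    static const unsigned char magic[4] = { 0x7F, 'E', 'L', 'F' };

    if (len < (size_t) IDENT_SIZE ||
        memcmp(buf, magic, sizeof(magic)) != 0)
        return (-1);

    switch (buf[IDX_CLASS]) {
    case CLASS_32:
        out->bits = 32;
        break;
    case CLASS_64:
        out->bits = 64;
        break;
    default:
        return (-1);
    }

    switch (buf[IDX_DATA]) {
    case DATA_LSB:
        out->big_endian = 0;
        break;
    case DATA_MSB:
        out->big_endian = 1;
        break;
    default:
        return (-1);
    }

    out->version = buf[IDX_VERSION];
    return (0);
}
```

Only once the class and byte order are known can the remaining fields of the header be read with the right widths and byte order, which is why these bytes come first.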


The e_type member determines the type of the ELF object. For example,
it would contain a ‘1’ (ET_REL) in a relocatable or ‘3’ (ET_DYN) in a shared
object.

The e_machine member describes the machine architecture this ELF object
is for. Example values are ‘3’ (EM_386) for the Intel® i386™ architecture
and ‘20’ (EM_PPC) for the 32-bit PowerPC™ architecture.


The ELF executable header also describes the layout of the rest of the
ELF object (Figure 3.3). The e_phoff and e_shoff fields contain the file
offsets where the ELF program header table and ELF section header table
reside. These fields are zero if the file does not have a program header table
or section header table respectively. The sizes of these components are
determined by the e_phentsize and e_shentsize members respectively
in conjunction with the number of entries in these tables.
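The size computation described above is a simple multiplication. The sketch below shows it with two helper functions, table_size and table_end, which are names invented for this illustration and are not part of libelf.

```c
#include <stdint.h>

/* Size in bytes of a header table holding `count` entries of `entsize` bytes. */
uint64_t
table_size(uint16_t entsize, uint64_t count)
{
    return ((uint64_t) entsize * count);
}

/* File offset one past the last byte of a table that starts at `offset`. */
uint64_t
table_end(uint64_t offset, uint16_t entsize, uint64_t count)
{
    return (offset + table_size(entsize, count));
}
```

With the values printed in listing 3.2 (e_phoff 0x40, e_phentsize 0x38, 5 program headers), the program header table is 0x118 bytes long and ends at file offset 0x158. The section header table in the same listing (e_shoff 0x16f8, e_shentsize 0x40, 0x18 sections) is 0x600 bytes long and ends at offset 0x1cf8.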

The ELF executable header describes its own size (in bytes) in field 
e_ehsize. 

The e_phnum and e_shnum fields usually contain the number of ELF
program header table entries and section header table entries. Note that
these fields are only 2 bytes wide, so if an ELF object has a large
number of sections or program header table entries, then a scheme known as
Extended Numbering (see section 3.1) is used to encode the
actual number of sections or program header table entries. When extended
numbering is in use these fields will contain “magic numbers” instead of
actual counts.


If the ELF object contains sections, then we need a way to get at the names
of sections. Section names are stored in a string table. The e_shstrndx
field stores the section index of this string table (see 3.1)
so that processing tools know which string table to use for retrieving the
names of sections. We will cover ELF string tables in more detail in
section 5.1.1.


The fields e_entry and e_flags are used for executables and are placed in 
the executable header for easy access at program load time. We will not look 
at them further in this tutorial. 


ELF Class- and Endianness- Independent Processing 

Now let us look at the way the libelf API set abstracts out ELF class and 
endianness for us. 

Imagine that you are writing an ELF processing application that is going
to support processing of non-native binaries (say for a machine with a different
native endianness and word size). It should be evident that ELF data structures
would have two distinct representations: an in-memory representation that
follows the rules for the machine architecture that the application is running
on, and an in-file representation that corresponds to the target architecture
for the ELF object.

The application would like to manipulate data in its native memory repre¬ 
sentation. This memory representation would conform to the native endianness 
of the host’s CPU and would conform to the address alignment and structure 
padding requirements set by the host’s machine architecture. 

When this data is written into the target object it may need to be formatted
differently. For example, it could be packed differently compared to the “native”
memory representation and may have to be laid out according to a different set
of rules for alignment. The endianness of the data in-file could be different
from that of the in-memory representation.

Figure 3.4 depicts the relationship between the file and memory
representation of an ELF data structure. As shown in the figure, the
size of an ELF data structure in the file could be different from its size in
memory. The alignment restrictions (%falign and %malign in the figure) could
be different. The byte ordering of the data could be different too.

Figure 3.4: File and Memory Representations (the file representation has a
file size and alignment %falign; the memory representation has a memory size
and alignment %malign)

The ELF(3) and GELF(3) API set can handle the conversion of ELF data
structures to and from their file and memory representations automatically.
For example, when we read in the ELF executable header in listing 3.1
below, the libelf library will automatically do the necessary
byteswapping and alignment adjustments for us.
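To give a feel for what such a translation involves, the following sketch reads 16- and 32-bit values from a file buffer according to the object's declared byte order, independently of the host's own endianness. The function names read_u16 and read_u32 and the enum byte_order are invented for this illustration; the two enumerator values mirror ELFDATA2LSB and ELFDATA2MSB. This is not how libelf is implemented, only a minimal model of the idea.

```c
#include <stdint.h>

/* Byte order of the file data; values mirror ELFDATA2LSB and ELFDATA2MSB. */
enum byte_order { ORDER_LSB = 1, ORDER_MSB = 2 };

/* Assemble a 16-bit value from two file bytes in the given order. */
uint16_t
read_u16(const unsigned char *p, enum byte_order order)
{
    if (order == ORDER_LSB)
        return ((uint16_t) (p[0] | (p[1] << 8)));
    return ((uint16_t) ((p[0] << 8) | p[1]));
}

/* Assemble a 32-bit value from four file bytes in the given order. */
uint32_t
read_u32(const unsigned char *p, enum byte_order order)
{
    if (order == ORDER_LSB)
        return ((uint32_t) p[0] | ((uint32_t) p[1] << 8) |
            ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24));
    return (((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
        ((uint32_t) p[2] << 8) | (uint32_t) p[3]);
}
```

Because the value is built byte by byte, the result lands in the host's native representation without the code needing to know what that representation is, and no alignment requirement is placed on the source buffer.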

For applications that desire finer-grain control over the conversion process,
the elfNN_xlatetof and elfNN_xlatetom functions (where NN is 32 or 64) are
available. These functions will translate data buffers containing ELF data
structures between their memory and file representations.

Extended numbering 

The e_shnum, e_phnum and e_shstrndx fields of the ELF executable header are
only 2 bytes long and cannot represent numbers larger than 65535. For ELF
objects with a large number of sections, we need a different way of encoding
section numbers.

ELF objects with such a large number of sections can arise due to the way
GCC copes with C++ templates. When compiling C++ code which uses
templates, GCC generates many sections with names following the pattern
“.gnu.linkonce.name”. While each compiled ELF relocatable object will now
contain replicated data, the linker is expected to treat such sections specially at
the final link stage, discarding all but one of each section.

When extended numbering is in use: 

• The e_shnum field of the ELF executable header is always zero and the
true number of sections is stored in the sh_size field of the section header
table entry at index 0.

• The true index of the section name string table is stored in the sh_link
field of the zeroth entry of the section header table, while the e_shstrndx
field of the executable header is set to SHN_XINDEX (0xFFFF).

• For extended program header table numbering the scheme is similar, with
the e_phnum field of the executable header holding the value PN_XNUM
(0xFFFF) and the sh_info field of the zeroth section header table entry
holding the actual number of program header table entries.

An application may use the functions elf_getphdrnum, elf_getshdrnum and
elf_getshdrstrndx to retrieve the correct value of these fields when extended
numbering is in use.
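The decoding rules for extended numbering can be sketched as three small functions. The struct and function names below (ehdr_counts, shdr0, true_shnum, true_shstrndx, true_phnum) are invented for this illustration and carry only the fields involved. The sketch follows the ELF gABI, which places the true program header count in the sh_info field of section header 0. It is a simplified model of what elf_getshdrnum, elf_getshdrstrndx and elf_getphdrnum take care of, not the library's actual implementation.

```c
#include <stdint.h>

#define XNUM_MARKER     0xFFFFu     /* PN_XNUM */
#define XINDEX_MARKER   0xFFFFu     /* SHN_XINDEX */

/* The executable header fields that take part in extended numbering. */
struct ehdr_counts {
    uint64_t e_shoff;
    uint16_t e_phnum;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};

/* The fields of section header table entry 0 that hold the true values. */
struct shdr0 {
    uint64_t sh_size;
    uint32_t sh_link;
    uint32_t sh_info;
};

/* True number of sections. */
uint64_t
true_shnum(const struct ehdr_counts *eh, const struct shdr0 *s0)
{
    /* A zero e_shnum with a section header table present means "look in entry 0". */
    if (eh->e_shnum == 0 && eh->e_shoff != 0)
        return (s0->sh_size);
    return (eh->e_shnum);
}

/* True index of the section name string table. */
uint32_t
true_shstrndx(const struct ehdr_counts *eh, const struct shdr0 *s0)
{
    if (eh->e_shstrndx == XINDEX_MARKER)
        return (s0->sh_link);
    return (eh->e_shstrndx);
}

/* True number of program header table entries. */
uint32_t
true_phnum(const struct ehdr_counts *eh, const struct shdr0 *s0)
{
    if (eh->e_phnum == XNUM_MARKER)
        return (s0->sh_info);
    return (eh->e_phnum);
}
```

For an ordinary object the three functions simply return the header fields; only when a marker value is present do they consult section header 0.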


3.2 Example: Reading an ELF executable header 

We will now look at a small program that will print out the ELF executable 
header in an ELF object. For this example we will introduce the GELF(3) API 
set. 

The ELF(3) API is defined in terms of ELF class-dependent types
(Elf32_Ehdr, Elf64_Shdr, etc.) and consequently has many operations that
have both 32- and 64-bit variants. So, in order to retrieve an ELF executable
header from a 32 bit ELF object we would need to use the function
elf32_getehdr, which would return a pointer to an Elf32_Ehdr structure. For a
64-bit ELF object, the function we would need to use would be elf64_getehdr,
which would return a pointer to an Elf64_Ehdr structure. This duplication is
awkward when you want to write applications that can transparently process
either class of ELF objects.

The GELF(3) APIs provide an ELF class independent way of writing ELF 
applications. These functions are defined in terms of “generic” types that are 
large enough to hold the values of their corresponding 32- and 64- bit ELF types. 
Further, the GELF(3) APIs always work on copies of ELF data structures
thus bypassing the problem of 32- and 64-bit ELF data structures having
incompatible memory layouts. You can freely mix calls to GELF(3) and ELF(3)
functions.

The downside of using the GELF(3) APIs is the extra copying and conversion 
of data that occurs. This overhead is usually not significant to most applications. 
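The idea of a “generic” type that can hold either class, filled in by copying, can be sketched with two cut-down structures. The names hdr32, hdr_generic and widen_hdr32 are invented for this illustration and carry only a few of the real header fields; the actual GElf_Ehdr type and gelf_getehdr function are provided by the library.

```c
#include <stdint.h>

/* A cut-down 32-bit header: addresses and offsets are 32 bits wide. */
struct hdr32 {
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_entry;
    uint32_t e_phoff;
    uint32_t e_shoff;
};

/* A cut-down generic header: wide enough for 32- and 64-bit values. */
struct hdr_generic {
    uint16_t e_type;
    uint16_t e_machine;
    uint64_t e_entry;
    uint64_t e_phoff;
    uint64_t e_shoff;
};

/* Copy a 32-bit header into the generic form, widening each field. */
void
widen_hdr32(const struct hdr32 *src, struct hdr_generic *dst)
{
    dst->e_type = src->e_type;
    dst->e_machine = src->e_machine;
    dst->e_entry = src->e_entry;
    dst->e_phoff = src->e_phoff;
    dst->e_shoff = src->e_shoff;
}
```

Because the generic structure is a separate copy, changing it does not affect the source structure. This is the extra copying that the text above refers to.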

Listing 3.1: Program 2

    /*
     * Print the ELF Executable Header from an ELF object.
     */

    #include <err.h>
    #include <fcntl.h>
    #include <gelf.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <sysexits.h>
    #include <unistd.h>
    #include <vis.h>

    int
    main(int argc, char **argv)
    {
        int i, fd;
        Elf *e;
        char *id, bytes[5];
        size_t n;
        GElf_Ehdr ehdr;

        if (argc != 2)
            errx(EX_USAGE, "usage: %s file-name", argv[0]);

        if (elf_version(EV_CURRENT) == EV_NONE)
            errx(EX_SOFTWARE, "ELF library initialization "
                "failed: %s", elf_errmsg(-1));

        if ((fd = open(argv[1], O_RDONLY, 0)) < 0)
            err(EX_NOINPUT, "open \"%s\" failed", argv[1]);

        if ((e = elf_begin(fd, ELF_C_READ, NULL)) == NULL)
            errx(EX_SOFTWARE, "elf_begin() failed: %s.",
                elf_errmsg(-1));

        if (elf_kind(e) != ELF_K_ELF)
            errx(EX_DATAERR, "\"%s\" is not an ELF object.",
                argv[1]);

        /* Fetch a class-independent copy of the executable header. */
        if (gelf_getehdr(e, &ehdr) == NULL)
            errx(EX_SOFTWARE, "getehdr() failed: %s.",
                elf_errmsg(-1));

        if ((i = gelf_getclass(e)) == ELFCLASSNONE)
            errx(EX_SOFTWARE, "getclass() failed: %s.",
                elf_errmsg(-1));

        (void) printf("%s: %d-bit ELF object\n", argv[1],
            i == ELFCLASS32 ? 32 : 64);

        if ((id = elf_getident(e, NULL)) == NULL)
            errx(EX_SOFTWARE, "getident() failed: %s.",
                elf_errmsg(-1));

        (void) printf("%3s e_ident[0..%1d] %7s", " ",
            EI_ABIVERSION, " ");

        /* Print each identification byte in visible form and in hex. */
        for (i = 0; i <= EI_ABIVERSION; i++) {
            (void) vis(bytes, id[i], VIS_WHITE, 0);
            (void) printf(" ['%s' %X]", bytes, id[i]);
        }

        (void) printf("\n");

    #define PRINT_FMT       "    %-20s 0x%jx\n"
    #define PRINT_FIELD(N)  do { \
            (void) printf(PRINT_FMT, #N, (uintmax_t) ehdr.N); \
        } while (0)

        PRINT_FIELD(e_type);
        PRINT_FIELD(e_machine);
        PRINT_FIELD(e_version);
        PRINT_FIELD(e_entry);
        PRINT_FIELD(e_phoff);
        PRINT_FIELD(e_shoff);
        PRINT_FIELD(e_flags);
        PRINT_FIELD(e_ehsize);
        PRINT_FIELD(e_phentsize);
        PRINT_FIELD(e_shentsize);

        /* These accessors cope with extended numbering. */
        if (elf_getshdrnum(e, &n) != 0)
            errx(EX_SOFTWARE, "getshdrnum() failed: %s.",
                elf_errmsg(-1));
        (void) printf(PRINT_FMT, "(shnum)", (uintmax_t) n);

        if (elf_getshdrstrndx(e, &n) != 0)
            errx(EX_SOFTWARE, "getshdrstrndx() failed: %s.",
                elf_errmsg(-1));
        (void) printf(PRINT_FMT, "(shstrndx)", (uintmax_t) n);

        if (elf_getphdrnum(e, &n) != 0)
            errx(EX_SOFTWARE, "getphdrnum() failed: %s.",
                elf_errmsg(-1));
        (void) printf(PRINT_FMT, "(phnum)", (uintmax_t) n);

        (void) elf_end(e);
        (void) close(fd);
        exit(EX_OK);
    }


Programs using the GELF(3) API set need to include gelf.h.

The GELF(3) functions always operate on local copies of data structures.
The GElf_Ehdr type has fields that are large enough to contain values for
a 64 bit ELF executable header.

We retrieve the ELF executable header using function gelf_getehdr. This
function will translate the ELF executable header in the ELF object being
read to the appropriate in-memory representation for type GElf_Ehdr. For
example, if a 32-bit ELF object is being examined, then the values in its
executable header would be appropriately converted (expanded and/or
byteswapped) by this function.

The gelf_getclass function retrieves the ELF class of the object being
examined.

Here we show the use of the elf_getident function to retrieve the contents
of the e_ident[] array from the underlying file. These bytes would also
be present in the e_ident member of the ehdr structure.

We print the first few bytes of the e_ident field of the ELF executable
header.

Following the e_ident bytes, we print the values of some of the fields of the
ELF executable header structure.

The functions elf_getphdrnum, elf_getshdrnum and elf_getshdrstrndx
described in section 3.1 should be used to retrieve
the count of program header table entries, the number of sections, and
the section name string table index respectively. Using these functions
insulates your application from the quirks of extended numbering.


Save the program in listing 3.1 to file prog2.c and then compile
and run it as shown in listing 3.2.

Listing 3.2: Compiling and Running prog2

    % cc -o prog2 prog2.c -lelf
    % ./prog2 prog2
    prog2: 64-bit ELF object
        e_ident[0..8]         ['\^?' 7F] ['E' 45] ['L' 4C] ['F' 46] \
        ['\^B' 2] ['\^A' 1] ['\^A' 1] ['\^I' 9] ['\^@' 0]
        e_type               0x2
        e_machine            0x3e
        e_version            0x1
        e_entry              0x400a10
        e_phoff              0x40
        e_shoff              0x16f8
        e_flags              0x0
        e_ehsize             0x40
        e_phentsize          0x38
        e_shentsize          0x40
        (shnum)              0x18
        (shstrndx)           0x15
        (phnum)              0x5


The process for compiling and linking an application that uses GELF(3) is the
same as that for ELF(3) programs.

We run our program on itself. The listing in this tutorial was generated on
an AMD64™ machine running FreeBSD™.


You should now run prog2 on other object files that you have lying around.
Try it on a few non-native ELF object files too.

Chapter 4


Examining the Program Header Table


Before a program on disk can be executed by a processor it needs to be brought
into main memory. This process is conventionally called “loading”.

When loading an ELF object into memory, the operating system views it
as being composed of “segments”. Each such segment is a contiguous region of
data inside the ELF object that is associated with a particular protection
characteristic (for example, read-only or read-write) and that gets placed at
a specific virtual memory address.

For example, the FreeBSD™ operating system expects executables to have
an “executable” segment containing code, and a “data” segment containing
statically initialized data. The executable segment would be mapped in with
read and execute permissions and could be shared across multiple processes 
using the same ELF executable. The data segment would be mapped in with 
read and write permissions and would be made private to each process. For 
dynamically linked executables, the basic idea of grouping related parts of an 
ELF object into contiguous “segments” still holds, though there may be multiple 
segments of each type per process. 
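The protection characteristic of a segment is recorded in the p_flags field of its program header table entry (see table 4.1). As a small sketch, the function below turns such a flags word into the familiar “rwx” notation. The function name segment_perms is invented for this illustration; the bit values mirror PF_X, PF_W and PF_R from the ELF specification.

```c
#include <stdint.h>

#define FLAG_X  0x1u    /* PF_X: segment is executable */
#define FLAG_W  0x2u    /* PF_W: segment is writable */
#define FLAG_R  0x4u    /* PF_R: segment is readable */

/* Write a three-character permission string such as "r-x" into out. */
void
segment_perms(uint32_t p_flags, char out[4])
{
    out[0] = (p_flags & FLAG_R) ? 'r' : '-';
    out[1] = (p_flags & FLAG_W) ? 'w' : '-';
    out[2] = (p_flags & FLAG_X) ? 'x' : '-';
    out[3] = '\0';
}
```

In these terms, the executable segment described above would typically show up as "r-x" and the data segment as "rw-".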


4.1 The ELF Program Header Table 


The ELF program header table describes the segments present in an ELF file. 
The location of the program header table is described by the e_phoff field of the 
ELF executable header (see section 3.1 on page 16). The program header table 
is a contiguous array of program header table entries, one entry per segment. 

Figure 4.1 on the following page shows graphically how the fields of a 
program header table entry specify the segment’s placement in file and in memory. 

The structure of each program header table entry is shown in table 4.1 on 
the next page. 
Figure 4.1: ELF Segment Placement (diagram not reproduced; it relates the 
fields of a program header table entry to the segment’s placement in the file 
and in memory) 


Table 4.1: ELF Program Header Table Entries 

32 bit PHDR Table Entry              64 bit PHDR Table Entry 

typedef struct {                     typedef struct { 
    Elf32_Word  p_type;                  Elf64_Word   p_type; 
    Elf32_Off   p_offset;                Elf64_Word   p_flags; 
    Elf32_Addr  p_vaddr;                 Elf64_Off    p_offset; 
    Elf32_Addr  p_paddr;                 Elf64_Addr   p_vaddr; 
    Elf32_Word  p_filesz;                Elf64_Addr   p_paddr; 
    Elf32_Word  p_memsz;                 Elf64_Xword  p_filesz; 
    Elf32_Word  p_flags;                 Elf64_Xword  p_memsz; 
    Elf32_Word  p_align;                 Elf64_Xword  p_align; 
} Elf32_Phdr;                        } Elf64_Phdr; 


p_type: The type of the program header table entry is encoded using this 
field. It holds one of the PT_* constants defined in the system headers. 

Examples include: 

• A segment of type PT_LOAD is loaded into memory. 

• A segment of type PT_NOTE contains auxiliary information. For example, 
core files use PT_NOTE segments to record the name of the process 
that dumped core. 

• A PT_PHDR segment describes the program header table itself. 

The ELF specification reserves type values from 0x60000000 (PT_LOOS) to 
0x6FFFFFFF (PT_HIOS) for OS-private information. Values from 0x70000000 
(PT_LOPROC) to 0x7FFFFFFF (PT_HIPROC) are similarly reserved for 
processor-specific information. 

p_offset: The p_offset field holds the file offset in the ELF object to the 
start of the segment being described by this table entry. 

p_vaddr: The virtual address this segment should be loaded at. 

p_paddr: The physical address this segment should be loaded at. This field 
does not apply for userland objects. 

p_filesz: The number of bytes the segment takes up in the file. This number 
is zero for segments that do not have data associated with them in the file. 

p_memsz: The number of bytes the segment takes up in memory. It may be larger 
than p_filesz, in which case the extra bytes are zero-filled when the segment 
is loaded. 

p_flags: Additional flags that specify segment properties. For example, flag 
PF_X specifies that the segment in question should be made executable and flag 
PF_W denotes that the segment should be writable. 

p_align: The alignment requirements of the segment both in memory and in the 
file. This field holds a value that is a power of two. 

Note: The careful reader will note that the 32- and 64-bit Elf_Phdr 
structures are laid out differently in memory. These differences are handled 
for you by the functions in the libelf library. 
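
The relationships between these fields can be checked with a little 
arithmetic. The sketch below is illustrative only: the function names are 
hypothetical, and the rules it encodes are those of the ELF specification (a 
p_align of 0 or 1 means no alignment constraint; otherwise p_align is a power 
of two and p_vaddr equals p_offset modulo p_align). 

```c
#include <stdbool.h>
#include <stdint.h>

/* True if 'v' is a (non-zero) power of two. */
static bool
is_pow2(uint64_t v)
{
	return (v != 0 && (v & (v - 1)) == 0);
}

/*
 * Number of bytes of a segment that have no backing data in the file and
 * are therefore zero-filled in memory (for example, uninitialized data).
 */
uint64_t
segment_zero_fill(uint64_t p_filesz, uint64_t p_memsz)
{
	return (p_memsz > p_filesz ? p_memsz - p_filesz : 0);
}

/*
 * Check that a segment's file offset and virtual address are consistent
 * with its alignment requirement.
 */
bool
segment_placement_ok(uint64_t p_offset, uint64_t p_vaddr, uint64_t p_align)
{
	if (p_align <= 1)	/* 0 and 1 both mean "no alignment required" */
		return (true);
	if (!is_pow2(p_align))
		return (false);
	return ((p_offset % p_align) == (p_vaddr % p_align));
}
```

For instance, a segment with p_filesz 0x11c and p_memsz 0x13c has 0x20 
zero-filled bytes, and a segment at file offset 0xe68 with virtual address 
0x8049e68 satisfies a page alignment of 0x1000. 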


4.2 Example: Reading a Program Header Table 

We will now look at a program that will print out the program header table 
associated with an ELF object. We will continue to use the GELF(3) API set 
for this example. The ELF(3) API set also offers two ELF class-dependent APIs 
that retrieve the program header table from an ELF object: elf32_getphdr and 
elf64_getphdr, but these require us to know the ELF class of the object being 
handled. 

Listing 4.1: Program 3 

/* 
 * Print the ELF Program Header Table in an ELF object. 
 */ 

#include <err.h> 
#include <fcntl.h> 
#include <gelf.h> 
#include <stdio.h> 
#include <stdint.h> 
#include <stdlib.h> 
#include <sysexits.h> 
#include <unistd.h> 
#include <vis.h> 

void 
print_ptype(size_t pt) 
{ 
    char *s; 

#define C(V) case PT_##V: s = #V; break 
    switch (pt) { 
        C(NULL);        C(LOAD);        C(DYNAMIC); 
        C(INTERP);      C(NOTE);        C(SHLIB); 
        C(PHDR);        C(TLS);         C(SUNW_UNWIND); 
        C(SUNWBSS);     C(SUNWSTACK);   C(SUNWDTRACE); 
        C(SUNWCAP); 
    default: 
        s = "unknown"; 
        break; 
    } 
    (void) printf(" \"%s\"", s); 
#undef C 
} 

int 
main(int argc, char **argv) 
{ 
    int i, fd; 
    Elf *e; 
    char *id, bytes[5]; 
    size_t n; 
    GElf_Phdr phdr; 

    if (argc != 2) 
        errx(EX_USAGE, "usage: %s file-name", argv[0]); 

    if (elf_version(EV_CURRENT) == EV_NONE) 
        errx(EX_SOFTWARE, "ELF library initialization " 
            "failed: %s", elf_errmsg(-1)); 

    if ((fd = open(argv[1], O_RDONLY, 0)) < 0) 
        err(EX_NOINPUT, "open \"%s\" failed", argv[1]); 

    if ((e = elf_begin(fd, ELF_C_READ, NULL)) == NULL) 
        errx(EX_SOFTWARE, "elf_begin() failed: %s.", 
            elf_errmsg(-1)); 

    if (elf_kind(e) != ELF_K_ELF) 
        errx(EX_DATAERR, "\"%s\" is not an ELF object.", 
            argv[1]); 

    if (elf_getphdrnum(e, &n) != 0) 
        errx(EX_DATAERR, "elf_getphdrnum() failed: %s.", 
            elf_errmsg(-1)); 

    for (i = 0; i < n; i++) { 
        if (gelf_getphdr(e, i, &phdr) != &phdr) 
            errx(EX_SOFTWARE, "getphdr() failed: %s.", 
                elf_errmsg(-1)); 

        (void) printf("PHDR %d:\n", i); 

#define PRINT_FMT "    %-20s 0x%jx" 
#define PRINT_FIELD(N) do { \ 
        (void) printf(PRINT_FMT, #N, (uintmax_t) phdr.N); \ 
    } while (0) 
#define NL() do { (void) printf("\n"); } while (0) 

        PRINT_FIELD(p_type); 
        print_ptype(phdr.p_type);       NL(); 
        PRINT_FIELD(p_offset);          NL(); 
        PRINT_FIELD(p_vaddr);           NL(); 
        PRINT_FIELD(p_paddr);           NL(); 
        PRINT_FIELD(p_filesz);          NL(); 
        PRINT_FIELD(p_memsz);           NL(); 
        PRINT_FIELD(p_flags); 
        (void) printf(" ["); 
        if (phdr.p_flags & PF_X) 
            (void) printf(" execute"); 
        if (phdr.p_flags & PF_R) 
            (void) printf(" read"); 
        if (phdr.p_flags & PF_W) 
            (void) printf(" write"); 
        printf(" ]");                   NL(); 
        PRINT_FIELD(p_align);           NL(); 
    } 

    (void) elf_end(e); 
    (void) close(fd); 
    exit(EX_OK); 
} 


We need to include gelf.h in order to use the GELF(3) APIs. 

The GElf_Phdr type has fields that are large enough to contain the values 
in an Elf32_Phdr type and an Elf64_Phdr type. 

We retrieve the number of program header table entries using the function 
elf_getphdrnum. Note that the program header table is optional; for 
example, an ELF relocatable object will not have a program header table. 

We iterate over all valid indices for the object’s program header table, 
retrieving the table entry at each index using the gelf_getphdr function. 

We then print out the contents of the entry so retrieved. We use a 
helper function print_ptype to convert the p_type member to a readable 
string. 
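
The stringizing trick used by print_ptype can be tried in isolation, without 
libelf. The sketch below is a stand-alone variant under stated assumptions: the 
DEMO_PT_* names and the function ptype_name are hypothetical, the numeric 
values are those the ELF specification assigns to the generic PT_* constants, 
and the two reserved ranges are the OS-private and processor-specific ranges 
described in section 4.1. 

```c
#include <stdint.h>

/* Generic segment types, with the values given by the ELF specification. */
enum {
	DEMO_PT_NULL = 0, DEMO_PT_LOAD = 1, DEMO_PT_DYNAMIC = 2,
	DEMO_PT_INTERP = 3, DEMO_PT_NOTE = 4, DEMO_PT_SHLIB = 5,
	DEMO_PT_PHDR = 6, DEMO_PT_TLS = 7
};

/*
 * Map a p_type value to a printable name.  The C() macro pastes its
 * argument onto the DEMO_PT_ prefix with '##' and turns the same argument
 * into a string literal with '#'.
 */
const char *
ptype_name(uint32_t pt)
{
#define C(V) case DEMO_PT_##V: return (#V)
	switch (pt) {
	C(NULL);	C(LOAD);	C(DYNAMIC);
	C(INTERP);	C(NOTE);	C(SHLIB);
	C(PHDR);	C(TLS);
	default:
		break;
	}
#undef C
	if (pt >= 0x60000000u && pt <= 0x6fffffffu)
		return ("OS-specific");
	if (pt >= 0x70000000u && pt <= 0x7fffffffu)
		return ("processor-specific");
	return ("unknown");
}
```

Because the macro argument is only ever used with '#' and '##', it is not 
macro-expanded first, which is why C(NULL) yields the name "NULL" even though 
NULL is itself a macro in the standard headers. 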
Save the program in listing 4.1 on page 27 to file prog3.c and then compile 
and run it as shown in listing 4.2. 



Listing 4.2: Compiling and Running prog3 

% cc -o prog3 prog3.c -lelf 
% ./prog3 prog3 
PHDR 0: 
    p_type               0x6 "PHDR" 
    p_offset             0x34 
    p_vaddr              0x8048034 
    p_paddr              0x8048034 
    p_filesz             0xc0 
    p_memsz              0xc0 
    p_flags              0x5 [ execute read ] 
    p_align              0x4 
PHDR 1: 
    p_type               0x3 "INTERP" 
    p_offset             0xf4 
    p_vaddr              0x80480f4 
    p_paddr              0x80480f4 
    p_filesz             0x15 
    p_memsz              0x15 
    p_flags              0x4 [ read ] 
    p_align              0x1 
PHDR 2: 
    p_type               0x1 "LOAD" 
    p_offset             0x0 
    p_vaddr              0x8048000 
    p_paddr              0x8048000 
    p_filesz             0xe67 
    p_memsz              0xe67 
    p_flags              0x5 [ execute read ] 
    p_align              0x1000 
PHDR 3: 
    p_type               0x1 "LOAD" 
    p_offset             0xe68 
    p_vaddr              0x8049e68 
    p_paddr              0x8049e68 
    p_filesz             0x11c 
    p_memsz              0x13c 
    p_flags              0x6 [ read write ] 
    p_align              0x1000 
PHDR 4: 
    p_type               0x2 "DYNAMIC" 
    p_offset             0xe78 
    p_vaddr              0x8049e78 
    p_paddr              0x8049e78 
    p_filesz             0xb8 
    p_memsz              0xb8 
    p_flags              0x6 [ read write ] 
    p_align              0x4 
PHDR 5: 
    p_type               0x4 "NOTE" 
    p_offset             0x10c 
    p_vaddr              0x804810c 
    p_paddr              0x804810c 
    p_filesz             0x18 
    p_memsz              0x18 
    p_flags              0x4 [ read ] 
    p_align              0x4 


Compile and link the program in the standard way. 

We make our program examine its own program header table. This listing 
was generated on an i386 machine running FreeBSD. 

The very first entry in this program header table describes the program 
header table itself. 

An entry of type PT_INTERP is used to point the kernel to the “interpreter” 
associated with this ELF object. This is usually a runtime loader, such as 
/libexec/ld-elf.so.1. 

This object has two loadable segments: one with execute and read 
permissions and one with read and write permissions. Both these segments 
require page alignment. 


You should now run prog3 on other object files. 

• Try a relocatable object file created by a cc -c invocation. Does it have 
a program header table? 

• Try prog3 on shared libraries. What do their program header tables look 
like? 

• Can you locate ELF objects on your system that have PT_TLS header 
entries? 
Chapter 5 

Looking at Sections 


In the previous chapter we looked at the way executable ELF objects are 
viewed by the operating system. In this chapter we will look at the features of 
the ELF format that are used by compilers and linkers. 

For linking, data in an ELF object is grouped into sections. Each ELF 
section represents one kind of data. For example, a section could contain a table 
of strings used for program symbols, another could contain debug information, 
and another could contain machine code. Non-empty sections do not overlap in 
the file. 

ELF sections are described by entries in an ELF section header table. This 
table is usually placed at the very end of the ELF object (see figure 3.1 on 
page 16). Table 5.1 describes the elements of a section header table entry and 
figure 5.1 on page 35 shows graphically how the fields of an ELF section header 
specify the section’s placement. 


Table 5.1: ELF Section Header Table Entries 

32 bit SHDR Table Entry              64 bit SHDR Table Entry 

typedef struct {                     typedef struct { 
    Elf32_Word  sh_name;                 Elf64_Word   sh_name; 
    Elf32_Word  sh_type;                 Elf64_Word   sh_type; 
    Elf32_Word  sh_flags;                Elf64_Xword  sh_flags; 
    Elf32_Addr  sh_addr;                 Elf64_Addr   sh_addr; 
    Elf32_Off   sh_offset;               Elf64_Off    sh_offset; 
    Elf32_Word  sh_size;                 Elf64_Xword  sh_size; 
    Elf32_Word  sh_link;                 Elf64_Word   sh_link; 
    Elf32_Word  sh_info;                 Elf64_Word   sh_info; 
    Elf32_Word  sh_addralign;            Elf64_Xword  sh_addralign; 
    Elf32_Word  sh_entsize;              Elf64_Xword  sh_entsize; 
} Elf32_Shdr;                        } Elf64_Shdr; 



sh_name: The sh_name field is used to encode a section’s name. As section 
names are variable length strings, they are not kept in the section header 
table entry itself. Instead, all section names are collected into an 
object-wide string table holding section names, and the sh_name field of each 
section header stores an offset into this string table. The ELF executable 
header has an e_shstrndx member that holds the section index of this string 
table. ELF string tables, and the way to read them programmatically, are 
described in section 5.1.1 on page 37. 

sh_type: The sh_type field specifies the section type. Section types are 
defined by the SHT_* constants defined in the system’s ELF headers. For 
example, a section of type SHT_PROGBITS holds information defined by the 
program, such as executable code, while a section of type SHT_SYMTAB denotes 
a section containing a symbol table. 

The ELF specification reserves values in the range 0x60000000 to 0x6FFFFFFF 
to denote OS-specific section types, and values in the range 0x70000000 to 
0x7FFFFFFF for processor-specific section types. In addition, applications 
have been given the range 0x80000000 to 0xFFFFFFFF for their own use. 

sh_flags: Section flags indicate whether a section has specific properties, 
e.g., whether it contains writable data or instructions, or whether it has 
special link ordering requirements. Flag values from 0x00100000 to 0x08000000 
(8 flags) are reserved for OS-specific uses. Flag values from 0x10000000 to 
0x80000000 (4 flags) are reserved for processor-specific uses. 

sh_size: The sh_size member specifies the size of the section in bytes. 

sh_link, sh_info: The sh_link and sh_info fields contain additional 
section-specific information. These fields are described in the elf(5) manual 
page. 

sh_addralign: For sections that have specific alignment requirements, the 
sh_addralign member holds the required alignment. Its value is a power of two. 

sh_entsize: For sections that contain arrays of fixed-size elements, the 
sh_entsize member specifies the size of each element. 


There are a couple of other quirks associated with ELF sections. Valid 
section indices range from SHN_UNDEF (0) up to but not including SHN_LORESERVE 
(0xFF00). Section indices between 0xFF00 and 0xFFFF are used to denote 
special sections (like FORTRAN COMMON blocks). Thus if an ELF file has 
more than 65279 (0xFEFF) sections, then it needs to use extended section 
numbering (see section 3.1 on page 19). 

The section header table entry at index ‘0’ (SHN_UNDEF) is treated specially: 
it is always of type SHT_NULL. It has its members set to zero except when 
extended numbering is in use, see section 3.1 on page 19. 
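
The boundary arithmetic above is easy to get wrong by one, so here is a small 
sketch of it. The function names and the DEMO_SHN_* macros are hypothetical; 
the values 0xFF00 and 0xFFFF are those the ELF specification gives for 
SHN_LORESERVE and SHN_HIRESERVE. 

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_SHN_LORESERVE	0xff00u	/* first reserved section index */
#define DEMO_SHN_HIRESERVE	0xffffu	/* last reserved section index */

/* True if 'ndx' denotes a special (reserved) section, not a real one. */
bool
is_reserved_index(size_t ndx)
{
	return (ndx >= DEMO_SHN_LORESERVE && ndx <= DEMO_SHN_HIRESERVE);
}

/*
 * True if an object with 'nsections' section header table entries
 * (counting the SHT_NULL entry at index 0) cannot record its section
 * count in the ordinary way and must use extended section numbering.
 */
bool
needs_extended_numbering(size_t nsections)
{
	return (nsections >= DEMO_SHN_LORESERVE);
}
```

An object with 65279 sections still fits the ordinary scheme, while one with 
65280 (0xFF00) sections does not. 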


5.1 ELF section handling with libelf 

You can conveniently retrieve the contents of sections and section headers 
using the APIs in the ELF(3) library. Function elf_getscn will retrieve section 
information for a requested section number. Iteration through the sections of an 
ELF file is possible using function elf_nextscn. These routines will take care 
of translating between in-file and in-memory representations, thus simplifying 
your application. 

Figure 5.1: Section layout (diagram not reproduced) 

In the ELF(3) API set, ELF sections are managed using Elf_Scn descriptors. 
There is one Elf_Scn descriptor per ELF section in the ELF object. Functions 
elf_getscn and elf_nextscn retrieve pointers to Elf_Scn descriptors for 
pre-existing sections in the ELF object. (Chapter 6 on page 43 covers the use of 
function elf_newscn for allocating new sections.) 

Given an Elf_Scn descriptor, functions elf32_getshdr and elf64_getshdr 
retrieve its associated section header table entry. The GELF(3) API set offers 
an equivalent ELF-class independent function gelf_getshdr. 

Each Elf_Scn descriptor can be associated with zero or more Elf_Data 
descriptors. Elf_Data descriptors describe regions of application memory that 
contain the actual data in the ELF section. Elf_Data descriptors for a given 
Elf_Scn descriptor are retrieved using the elf_getdata function. 

Figure 5.2 on the next page shows graphically how an Elf_Scn descriptor 
could conceptually cover the content of a section with Elf_Data descriptors. 

Figure 5.3 on the following page depicts how an Elf_Data structure describes 
a chunk of application memory. Note that the figure reflects the fact that the 
in-memory representation of data could have a different size and endianness 
than its in-file representation. 

Listing 5.1 shows the C definition of the Elf_Scn and Elf_Data descriptors. 


Listing 5.1: Definition of Elf_Data and Elf_Scn 

typedef struct _Elf_Scn Elf_Scn; 
typedef struct _Elf_Data { 
    /* 
     * ‘Public’ members that are part of the ELF(3) API. 
     */ 
    uint64_t        d_align; 
    void            *d_buf; 
    uint64_t        d_off; 
    uint64_t        d_size; 
    Elf_Type        d_type; 
    unsigned int    d_version; 
    /* ... other library-private fields ... */ 
} Elf_Data; 

Figure 5.2: Managing data in an ELF Section (diagram not reproduced) 

Figure 5.3: Elf_Data descriptors (diagram not reproduced) 

Figure 5.4: String Table Layout (diagram not reproduced; it shows an initial 
NUL byte, followed by NUL-separated strings, followed by a terminating NUL 
byte) 


The Elf_Scn type is opaque to the application. 

d_align: The d_align member specifies alignment of data referenced in the 
Elf_Data with respect to its containing section. 

d_buf: The d_buf member points to a contiguous region of memory holding data. 

d_off: The d_off member contains the file offset from the start of the 
section of the data in this buffer. This field is usually managed by the 
library, but is under application control if the application has requested 
full control of the ELF file’s layout (see chapter 6 on page 43). 

d_size: The d_size member contains the size of the memory buffer. 

d_type: The d_type member specifies the ELF type of the data contained in the 
data buffer. Legal values for this member are precisely those defined by 
the Elf_Type enumeration in libelf.h. 

d_version: The d_version member specifies the working version for the data in 
this descriptor. It must be one of the values supported by the libelf library. 


Before we look at an example program we need to understand how string 
tables are implemented by libelf. 


5.1.1 String Tables 

String tables hold variable length strings, allowing other structures in an ELF 
object to refer to strings using offsets into the string table. Sections containing 
string tables have type SHT_STRTAB. 

Figure 5.4 shows the layout of a string table graphically: 

• The initial byte of a string table is NUL (a ‘\0’). This allows a string 
offset value of zero to denote the empty string. 

• Subsequent strings are separated by NUL bytes. 

• The final byte in the section is again a NUL so as to terminate the last 
string in the string table. 

An ELF file can have multiple string tables; for example, section names 
could be kept in one string table and symbol names in another. 

Given the section index of a section containing a string table, applications 
would use the elf_strptr function to convert a string offset to a char * pointer 
usable by C code. 
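
To make the layout concrete, here is a minimal sketch of a string table 
lookup over a plain memory buffer. It is not how libelf implements elf_strptr; 
the function strtab_lookup and its parameters are hypothetical and only 
illustrate the offset-to-pointer conversion together with the bounds checks 
that such a conversion needs. 

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the NUL-terminated string starting at 'offset' in
 * the string table 'tab' of 'size' bytes, or NULL if the offset is out of
 * range or the string is not terminated inside the table.
 */
const char *
strtab_lookup(const char *tab, size_t size, size_t offset)
{
	if (tab == NULL || offset >= size)
		return (NULL);
	/* The string must end before the table does. */
	if (memchr(tab + offset, '\0', size - offset) == NULL)
		return (NULL);
	return (tab + offset);
}
```

Given a table holding an initial NUL followed by the strings ".foo" and 
".shstrtab", offset 0 yields the empty string, offset 1 yields ".foo" and 
offset 6 yields ".shstrtab". 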

5.2 Example: Listing section names 

Let us now write a program that would retrieve and print the names of the 
sections present in an ELF object. This example will show you how to use: 

• Functions elf_nextscn and elf_getscn to retrieve Elf_Scn descriptors. 

• Function gelf_getshdr to retrieve a section header table entry 
corresponding to a section descriptor. 

• Function elf_strptr to convert section name indices to NUL-terminated 
strings. 

• Function elf_getdata to retrieve translated data associated with a 
section. 

Listing 5.2: Program 4 

/* 
 * Print the names of ELF sections. 
 */ 

#include <err.h> 
#include <fcntl.h> 
#include <gelf.h> 
#include <stdio.h> 
#include <stdint.h> 
#include <stdlib.h> 
#include <sysexits.h> 
#include <unistd.h> 
#include <vis.h> 

int 
main(int argc, char **argv) 
{ 
    int fd; 
    Elf *e; 
    /* vis(3) emits up to four characters plus a terminating NUL. */ 
    char *name, *p, pc[4 * sizeof(char) + 1]; 
    Elf_Scn *scn; 
    Elf_Data *data; 
    GElf_Shdr shdr; 
    size_t n, shstrndx, sz; 

    if (argc != 2) 
        errx(EX_USAGE, "usage: %s file-name", argv[0]); 

    if (elf_version(EV_CURRENT) == EV_NONE) 
        errx(EX_SOFTWARE, "ELF library initialization " 
            "failed: %s", elf_errmsg(-1)); 

    if ((fd = open(argv[1], O_RDONLY, 0)) < 0) 
        err(EX_NOINPUT, "open \"%s\" failed", argv[1]); 

    if ((e = elf_begin(fd, ELF_C_READ, NULL)) == NULL) 
        errx(EX_SOFTWARE, "elf_begin() failed: %s.", 
            elf_errmsg(-1)); 

    if (elf_kind(e) != ELF_K_ELF) 
        errx(EX_DATAERR, "%s is not an ELF object.", 
            argv[1]); 

    if (elf_getshdrstrndx(e, &shstrndx) != 0) 
        errx(EX_SOFTWARE, "elf_getshdrstrndx() failed: %s.", 
            elf_errmsg(-1)); 

    scn = NULL; 
    while ((scn = elf_nextscn(e, scn)) != NULL) { 
        if (gelf_getshdr(scn, &shdr) != &shdr) 
            errx(EX_SOFTWARE, "getshdr() failed: %s.", 
                elf_errmsg(-1)); 

        if ((name = elf_strptr(e, shstrndx, shdr.sh_name)) 
            == NULL) 
            errx(EX_SOFTWARE, "elf_strptr() failed: %s.", 
                elf_errmsg(-1)); 

        (void) printf("Section %-4.4jd %s\n", (uintmax_t) 
            elf_ndxscn(scn), name); 
    } 

    if ((scn = elf_getscn(e, shstrndx)) == NULL) 
        errx(EX_SOFTWARE, "getscn() failed: %s.", 
            elf_errmsg(-1)); 

    if (gelf_getshdr(scn, &shdr) != &shdr) 
        errx(EX_SOFTWARE, "getshdr(shstrndx) failed: %s.", 
            elf_errmsg(-1)); 

    (void) printf(".shstrab: size=%jd\n", (uintmax_t) 
        shdr.sh_size); 

    data = NULL; n = 0; 
    while (n < shdr.sh_size && 
        (data = elf_getdata(scn, data)) != NULL) { 
        p = (char *) data->d_buf; 
        while (p < (char *) data->d_buf + data->d_size) { 
            if (vis(pc, *p, VIS_WHITE, 0)) 
                printf("%s", pc); 
            n++; p++; 
            (void) putchar((n % 16) ? ' ' : '\n'); 
        } 
    } 
    (void) putchar('\n'); 

    (void) elf_end(e); 
    (void) close(fd); 
    exit(EX_OK); 
} 


We retrieve the section index of the ELF section containing the string 
table of section names using function elf_getshdrstrndx. The use of 
elf_getshdrstrndx allows our program to work correctly when the object 
being examined has a very large number of sections. 

Function elf_nextscn has the useful property that it returns the pointer 
to section number ‘1’ if a NULL section pointer is passed in. Recall that 
section number ‘0’ is always of type SHT_NULL and is not interesting to 
applications. 

We loop over all sections in the ELF object. Function elf_nextscn will 
return NULL at the end, which is a convenient way to exit the processing 
loop. 

Given an Elf_Scn pointer, we retrieve the associated section header using 
function gelf_getshdr. The sh_name member of this structure holds the 
required offset into the section name string table. 

We convert the string offset in member sh_name to a char * pointer using 
function elf_strptr. This value is then printed using printf. 

We retrieve the section descriptor associated with the string table holding 
section names. Variable shstrndx was retrieved by a prior call to function 
elf_getshdrstrndx. 

We cycle through the Elf_Data descriptors associated with the section in 
question, printing the characters in each data buffer. 
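
The vis(3) function used in listing 5.2 comes from BSD systems. Where it is 
not available, a rough stand-in can be sketched as below. This is an 
assumption-laden approximation, not a reimplementation of vis(3): the function 
render_byte is hypothetical, it renders control characters in the "\^X" style 
seen in the sample output, and it falls back to a three-digit octal escape for 
every other non-graphic byte. 

```c
#include <ctype.h>
#include <stdio.h>

/*
 * Render the byte 'c' in a visible form into 'out', which must have room
 * for four characters plus a terminating NUL.  Returns 'out'.
 */
char *
render_byte(unsigned char c, char out[5])
{
	if (isgraph(c))			/* printable and not a space */
		(void) snprintf(out, 5, "%c", c);
	else if (c < 0x20)		/* control character, e.g. NUL -> \^@ */
		(void) snprintf(out, 5, "\\^%c", c + '@');
	else if (c == 0x7f)		/* DEL */
		(void) snprintf(out, 5, "\\^?");
	else				/* space and bytes >= 0x80 */
		(void) snprintf(out, 5, "\\%03o", c);
	return (out);
}
```

The five-byte buffer matters: the longest rendering is a backslash followed by 
three octal digits, and the terminating NUL needs a byte of its own. 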


Save the program in listing 5.2 on page 38 to file prog4.c and then compile 
and run it as shown in listing 5.3 on the next page. 
Listing 5.3: Compiling and Running prog4 

% cc -o prog4 prog4.c -lelf 
% ./prog4 prog4 
Section 0001 .interp 
Section 0002 .note.ABI-tag 
Section 0003 .hash 
Section 0004 .dynsym 
Section 0005 .dynstr 
Section 0006 .rela.plt 
Section 0007 .init 
Section 0008 .plt 
Section 0009 .text 
Section 0010 .fini 
Section 0011 .rodata 
Section 0012 .data 
Section 0013 .eh_frame 
Section 0014 .dynamic 
Section 0015 .ctors 
Section 0016 .dtors 
Section 0017 .jcr 
Section 0018 .got 
Section 0019 .bss 
Section 0020 .comment 
Section 0021 .shstrtab 
Section 0022 .symtab 
Section 0023 .strtab 
.shstrab: size=287 
\^@ . s y m t a b \^@ . s t r t a b 
\^@ . s h s t r t a b \^@ . i n t e 
r p \^@ . h a s h \^@ . d y n s y m 
...etc... 


Compile and link the program in the standard way. 

We make our program print the names of its own sections. 

One of the sections contains the string table used for section names 
themselves. This section is called .shstrtab by convention. 

This is the content of the string table holding section names. 
Chapter 6 


Creating new ELF objects 


We will now look at how ELF objects can be created (and modified, see 
section 6.2.4 on page 49) using the libelf library. 

Broadly speaking, the steps involved in creating an ELF file with libelf 
are: 

1. An ELF descriptor needs to be allocated with a call to elf_begin, passing 
in the parameter ELF_C_WRITE. 

2. You would then allocate an ELF executable header using one of the 
elf32_newehdr, elf64_newehdr or gelf_newehdr functions. Note that 
this is a mandatory step since an ELF executable header is always present 
in an ELF object. The ELF “class” of the object, i.e., whether the object 
is a 32-bit or 64-bit one, is fixed at this time. 

3. An ELF program header table is optional and can be allocated using one of 
functions elf32_newphdr, elf64_newphdr or gelf_newphdr. The program 
header table can be allocated anytime after the executable header has been 
allocated. 

4. Sections may be added to an ELF object using function elf_newscn. 
Elf_Data descriptors associated with an ELF section can be added to 
a section descriptor using function elf_newdata. ELF sections can be 
allocated anytime after the object’s executable header has been allocated. 

5. If you are creating an ELF object for a non-native architecture, you can 
change the byte ordering of the object by changing the byte-order byte at 
offset EI_DATA in the ELF header. 

6. Once your data is in place, you then ask the libelf library to write out 
the final ELF object using function elf_update. 

7. Finally, you close the ELF descriptor using function elf_end. 
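
Step 5 hinges on whether the target object’s byte order differs from that of 
the machine running the tool. The sketch below shows one way an application 
could find that out for itself; libelf performs the actual translation, so 
this is purely illustrative. The function names and DEMO_* macros are 
hypothetical, while the values 1 and 2 are those the ELF specification gives 
for ELFDATA2LSB and ELFDATA2MSB. 

```c
#include <stdint.h>
#include <string.h>

#define DEMO_ELFDATA2LSB	1	/* least significant byte first */
#define DEMO_ELFDATA2MSB	2	/* most significant byte first */

/* Return the EI_DATA value that describes the host's own byte order. */
int
host_elf_data(void)
{
	const uint32_t probe = 0x01020304u;
	unsigned char first;

	/* Inspect the byte stored at the lowest address of 'probe'. */
	memcpy(&first, &probe, 1);
	return (first == 0x04 ? DEMO_ELFDATA2LSB : DEMO_ELFDATA2MSB);
}

/* True if data for an object with byte order 'target' must be swapped. */
int
needs_byte_swap(int target)
{
	return (target != host_elf_data());
}
```

On a little-endian host, creating an object marked ELFDATA2MSB is exactly the 
case in which multi-byte values have to be swapped on the way out. 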

6.1 Example: Creating an ELF object 

In listing 6.1 on the next page we will look at a program that creates a simple 
ELF object with a program header table, one ELF section containing translatable 
data and one ELF section containing a section name string table. We will 
mark the object as a 32-bit ELF object using MSB-first data ordering. 


Listing 6.1: Program 5 

/* 
 * Create an ELF object. 
 */ 

#include <err.h> 
#include <fcntl.h> 
#include <libelf.h> 
#include <stdint.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <sysexits.h> 
#include <unistd.h> 

uint32_t hash_words[] = { 
    0x01234567, 
    0x89abcdef, 
    0xdeadc0de 
}; 

char string_table[] = { 
    /* Offset 0 */  '\0', 
    /* Offset 1 */  '.', 'f', 'o', 'o', '\0', 
    /* Offset 6 */  '.', 's', 'h', 's', 't', 
                    'r', 't', 'a', 'b', '\0' 
}; 

int 
main(int argc, char **argv) 
{ 
    int fd; 
    Elf *e; 
    Elf_Scn *scn; 
    Elf_Data *data; 
    Elf32_Ehdr *ehdr; 
    Elf32_Phdr *phdr; 
    Elf32_Shdr *shdr; 

    if (argc != 2) 
        errx(EX_USAGE, "usage: %s file-name", argv[0]); 

    if (elf_version(EV_CURRENT) == EV_NONE) 
        errx(EX_SOFTWARE, "ELF library initialization " 
            "failed: %s", elf_errmsg(-1)); 

    if ((fd = open(argv[1], O_WRONLY|O_CREAT, 0777)) < 0) 
        err(EX_OSERR, "open \"%s\" failed", argv[1]); 

    if ((e = elf_begin(fd, ELF_C_WRITE, NULL)) == NULL) 
        errx(EX_SOFTWARE, "elf_begin() failed: %s.", 
            elf_errmsg(-1)); 

    if ((ehdr = elf32_newehdr(e)) == NULL) 
        errx(EX_SOFTWARE, "elf32_newehdr() failed: %s.", 
            elf_errmsg(-1)); 

    ehdr->e_ident[EI_DATA] = ELFDATA2MSB; 
    ehdr->e_machine = EM_PPC;   /* 32-bit PowerPC object */ 
    ehdr->e_type = ET_EXEC; 

    if ((phdr = elf32_newphdr(e, 1)) == NULL) 
        errx(EX_SOFTWARE, "elf32_newphdr() failed: %s.", 
            elf_errmsg(-1)); 

    if ((scn = elf_newscn(e)) == NULL) 
        errx(EX_SOFTWARE, "elf_newscn() failed: %s.", 
            elf_errmsg(-1)); 

    if ((data = elf_newdata(scn)) == NULL) 
        errx(EX_SOFTWARE, "elf_newdata() failed: %s.", 
            elf_errmsg(-1)); 

    data->d_align = 4; 
    data->d_off = 0LL; 
    data->d_buf = hash_words; 
    data->d_type = ELF_T_WORD; 
    data->d_size = sizeof(hash_words); 
    data->d_version = EV_CURRENT; 

    if ((shdr = elf32_getshdr(scn)) == NULL) 
        errx(EX_SOFTWARE, "elf32_getshdr() failed: %s.", 
            elf_errmsg(-1)); 

    shdr->sh_name = 1; 
    shdr->sh_type = SHT_HASH; 
    shdr->sh_flags = SHF_ALLOC; 
    shdr->sh_entsize = 0; 

    if ((scn = elf_newscn(e)) == NULL) 
        errx(EX_SOFTWARE, "elf_newscn() failed: %s.", 
            elf_errmsg(-1)); 

    if ((data = elf_newdata(scn)) == NULL) 
        errx(EX_SOFTWARE, "elf_newdata() failed: %s.", 
            elf_errmsg(-1)); 

    data->d_align = 1; 
    data->d_buf = string_table; 
    data->d_off = 0LL; 
    data->d_size = sizeof(string_table); 
    data->d_type = ELF_T_BYTE; 
    data->d_version = EV_CURRENT; 

    if ((shdr = elf32_getshdr(scn)) == NULL) 
        errx(EX_SOFTWARE, "elf32_getshdr() failed: %s.", 
            elf_errmsg(-1)); 

    shdr->sh_name = 6; 
    shdr->sh_type = SHT_STRTAB; 
    shdr->sh_flags = SHF_STRINGS | SHF_ALLOC; 
    shdr->sh_entsize = 0; 

    elf_setshstrndx(e, elf_ndxscn(scn)); 

    if (elf_update(e, ELF_C_NULL) < 0) 
        errx(EX_SOFTWARE, "elf_update(NULL) failed: %s.", 
            elf_errmsg(-1)); 

    phdr->p_type = PT_PHDR; 
    phdr->p_offset = ehdr->e_phoff; 
    phdr->p_filesz = elf32_fsize(ELF_T_PHDR, 1, EV_CURRENT); 

    (void) elf_flagphdr(e, ELF_C_SET, ELF_F_DIRTY); 

    if (elf_update(e, ELF_C_WRITE) < 0) 
        errx(EX_SOFTWARE, "elf_update() failed: %s.", 
            elf_errmsg(-1)); 

    (void) elf_end(e); 
    (void) close(fd); 

    exit(EX_OK); 
} 


We include libelf.h to bring in prototypes for libelf’s functions. 

We will create an ELF section containing ‘hash’ values. These values are 
present in host-native order in the array hash_words. These values will 
be translated to the appropriate byte order by the libelf library when 
the object file is created. 

We use a pre-fabricated ELF string table to hold section names. See 
section 5.1.1 on page 37 for more information on the layout of ELF string 
tables. 

The first step to create an ELF object is to obtain a file descriptor from the 
OS that is opened for writing. 

By passing parameter ELF_C_WRITE to function elf_begin, we obtain an 
ELF descriptor suitable for creating new ELF objects. 

We allocate an ELF executable header and set the EI_DATA byte in its 
e_ident member. The machine type is set to EM_PPC denoting the PowerPC 
architecture, and the object is marked as an ELF executable. 

We allocate an ELF program header table with one entry. At this point 
in time we do not know how the ELF object will be laid out, so we don’t 
know where the ELF program header table will reside. We will update 
this entry later. 

We create a section descriptor for the section containing the ‘hash’ values, 
and associate the data in the hash_words array with this descriptor. The 
type of the section is set to SHT_HASH. The library will compute its size and 
location in the final object and will byte-swap the values when creating 
the ELF object. 

We allocate another section for holding the string table. We use the 
pre-fabricated string table in variable string_table. The type of the section 
is set to SHT_STRTAB. Its offset and size in the file will be computed by the 
library. 

We set the string table index field in the ELF executable header using the 
function elf_setshstrndx. 

Calling function elf_update with parameter ELF_C_NULL indicates that 
the libelf library is to compute the layout of the object, updating all 
internal data structures, but not write it out. We can thus fill in the 
values in the ELF program header table entry that we had allocated using 
the new values in the executable header after this call to elf_update. 
The program header table is then marked “dirty” using a call to function 
elf_flagphdr, so that a subsequent call to elf_update will use the new 
contents. 

A call to function elf_update with parameter ELF_C_WRITE causes the 
object file to be written out. 


Save the program in listing 6.1 to file prog5.c and then compile 
and run it as shown in listing 6.2. 

Listing 6.2: Compiling and Running prog5 

% cc -o prog5 prog5.c -lelf 
% ./prog5 foo 
% file foo 
foo: ELF 32-bit MSB executable, PowerPC or cisco 4500, \ 
    version 1 (SYSV), statically linked, stripped 
% readelf -a foo 
ELF Header: 
  Magic:   7f 45 4c 46 01 02 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32 
  Data:                              2's complement, big endian 
  Version:                           1 (current) 
  OS/ABI:                            UNIX - System V 
  ABI Version:                       0 
  Type:                              EXEC (Executable file) 
  Machine:                           PowerPC 
  Version:                           0x1 
  Entry point address:               0x0 
  Start of program headers:          52 (bytes into file) 
  Start of section headers:          112 (bytes into file) 
  Flags:                             0x0 
  Size of this header:               52 (bytes) 
  Size of program headers:           32 (bytes) 
  Number of program headers:         1 
  Size of section headers:           40 (bytes) 
  Number of section headers:         3 
  Section header string table index: 2 
  ...etc... 


Compile, link and run the program in the standard way. 




We use the file and readelf programs to examine the object that we 
have created. 


6.2 The finer points in creating ELF objects 

Some of the finer points in creating ELF objects using the libelf library are 
examined below. We cover memory management rules, ELF data structure 
lifetimes, and how an application can take full control over an object’s layout. 
We also briefly cover how to modify an existing ELF object. 

6.2.1 Controlling ELF Layout 

By default, the libelf library will lay out your ELF objects for you. The default 
layout is shown in figure 3.1. An application may request fine-grained 
control over the ELF object’s layout by setting the flag ELF_F_LAYOUT on the 
ELF descriptor using function elf_flagelf. 

Once an ELF descriptor has been flagged with flag ELF_F_LAYOUT the following 
members of the ELF data structures come under application control: 

• The e_phoff and e_shoff fields, which determine where the ELF program 
header table and section header table start. 

• For each section, the sh_addralign, sh_offset, and sh_size fields in its 
section header. 

These fields must be set prior to calling function elf_update. 

The library will fill “gaps” between parts of the ELF file with a fill character. 
An application may set the fill character using the function elf_fill. The 
default fill character is a zero byte. 

6.2.2 Memory Management 

Applications pass pointers to allocated memory to the libelf library by setting 
the d_buf members of Elf_Data structures passed to the library. The libelf 
library also passes data back to the application using the same mechanism. In 
order to keep tracking memory ownership simple, the libelf library follows the 
rule that it will never attempt to free data that it did not allocate. Conversely, 
the application is also not to free memory allocated by the libelf library. 

6.2.3 libelf data structure lifetimes 

As part of the process of writing out an ELF object, the libelf library may 
release or reallocate its internal bookkeeping structures. 

A rule to be followed when using the libelf library is that all pointers to 
returned data structures (e.g., pointers to Elf_Scn and Elf_Data structures or 
to other ELF headers) become invalid after a call to function elf_update with 
parameter ELF_C_WRITE. 

After a successful call to function elf_update all ELF data structures will 
need to be retrieved afresh. 

6.2.4 Modifying existing ELF objects 

The libelf library also allows existing ELF objects to be modified. The process 
is similar to that for creating ELF objects, the differences being: 

• The underlying file object would need to be opened for reading and writing, 
and the call to function elf_begin would use parameter ELF_C_RDWR 
instead of ELF_C_WRITE. 

• The application would use the elf_get* APIs to retrieve existing ELF 
data structures in addition to the elf_new* APIs used for allocating new 
data structures. The libelf library would be informed of modifications 
to ELF data structures by calls to the appropriate elf_flag* functions. 


The rest of the program flow would be similar to the object creation case. 

Chapter 7 

Processing ar(1) archives 


The libelf library also offers support for reading archive members in an ar(1) 
archive. This support is “read-only”; you cannot create new ar(1) archives 
or update members in an archive using these functions. The libelf library 
supports both random and sequential access to the members of an ar(1) archive. 

7.1 Archive structure 

Each ar(1) archive starts with a sequence of 8 signature bytes (see the constant 
ARMAG defined in the system header ar.h). The members of the archive follow, 
each member preceded by an archive header describing the metadata associated 
with the member. Figure 7.1 depicts the structure of an ar(1) archive 
pictorially. 

Each archive header is a collection of fixed size ASCII strings. Archive 
headers are required to reside at even offsets in the archive file. Listing 7.1 
shows the layout of the archive header as a C structure. 

Listing 7.1: Archive Header Layout 

struct ar_hdr { 
        char ar_name[16];       /* file name */ 
        char ar_date[12];       /* file modification time */ 
        char ar_uid[6];         /* creator user id */ 
        char ar_gid[6];         /* creator group id */ 
        char ar_mode[8];        /* octal file permissions */ 
        char ar_size[10];       /* size in bytes */ 
#define ARFMAG  "`\n" 
        char ar_fmag[2];        /* consistency check */ 
} __packed; 
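
To make the fixed-width ASCII encoding concrete, here is a small sketch that 
decodes a raw header without using libelf. The helper names ar_decimal_field, 
ar_header_ok and ar_next_header are hypothetical, and the structure is 
repeated (without the __packed attribute, which is not needed for character 
arrays) so that the fragment stands alone. 

```c
#include <stdlib.h>
#include <string.h>

#define ARFMAG	"`\n"

struct ar_hdr {
	char ar_name[16];	/* file name */
	char ar_date[12];	/* file modification time */
	char ar_uid[6];		/* creator user id */
	char ar_gid[6];		/* creator group id */
	char ar_mode[8];	/* octal file permissions */
	char ar_size[10];	/* size in bytes */
	char ar_fmag[2];	/* consistency check */
};

/*
 * Decode a space-padded decimal field such as ar_size or ar_date.
 * The field is not NUL-terminated, so it is copied first.
 * Returns -1 if the field is malformed.
 */
long
ar_decimal_field(const char *field, size_t width)
{
	char buf[16];
	char *end;
	long value;

	if (width >= sizeof(buf))
		return (-1);
	(void) memcpy(buf, field, width);
	buf[width] = '\0';

	value = strtol(buf, &end, 10);
	if (end == buf || value < 0)
		return (-1);
	while (*end == ' ')
		end++;
	return (*end == '\0' ? value : -1);
}

/* Check the two trailing consistency bytes of a header. */
int
ar_header_ok(const struct ar_hdr *h)
{
	return (memcmp(h->ar_fmag, ARFMAG, sizeof(h->ar_fmag)) == 0);
}

/*
 * Offset of the header that follows a member. Headers reside at even
 * offsets, so a member of odd size is followed by one padding byte.
 */
long
ar_next_header(long hdr_offset, long member_size)
{
	return (hdr_offset + (long) sizeof(struct ar_hdr) +
	    member_size + (member_size & 1));
}
```

For example, a 7-byte member whose header starts at offset 8 (directly after 
the signature bytes) is followed by a header at offset 8 + 60 + 7 + 1. 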

The initial members of an ar(1) archive may be special: 

• An archive member with name “/” is an archive symbol table. An archive 
symbol table maps program symbols to archive members in an archive. It 
is usually maintained by tools like ranlib and ar. 

• An archive member with name “//” is an archive string table. The members 
of an ar(1) header only contain fixed size ASCII strings with space 
and ‘/’ characters being used for string termination. File names that exceed 
the length limits of the ar_name member are handled by placing them 
in a special string table (not to be confused with ELF string tables) and 
storing the offset of the file name in the ar_name member as a string of 
decimal digits. 

Figure 7.1: The structure of ar(1) archives 

The archive handling functions offered by the libelf library insulate the 
application from these details of the layout of ar(1) archives. 
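
For readers curious about what the library hides, the following sketch 
resolves a raw ar_name field to a C string. It is not libelf code. The helper 
name ar_resolve_name is hypothetical, and two details are assumptions about 
the common SVR4-style format rather than statements from this tutorial: that a 
long name is referenced as ‘/’ followed by the decimal offset, and that each 
entry in the archive string table ends with “/” and a newline. 

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: translate the 16-byte `ar_name` field into a
 * NUL-terminated name in `out`. `strtab` and `strtab_len` describe the
 * contents of the "//" member (NULL and 0 if the archive has none).
 * The special members "/" and "//" are assumed to be handled by the
 * caller. Returns 0 on success, -1 on error.
 */
int
ar_resolve_name(const char *ar_name, const char *strtab,
    size_t strtab_len, char *out, size_t outlen)
{
	char digits[16];
	const char *src, *end;
	unsigned long off;
	size_t n;

	if (ar_name[0] == '/' && ar_name[1] >= '0' && ar_name[1] <= '9') {
		/* Long name: the field holds an offset into the table. */
		(void) memcpy(digits, ar_name + 1, 15);
		digits[15] = '\0';
		off = strtoul(digits, NULL, 10);
		if (strtab == NULL || off >= strtab_len)
			return (-1);
		src = strtab + off;
		end = memchr(src, '\n', strtab_len - off);
		if (end == NULL)
			return (-1);
		if (end > src && end[-1] == '/')
			end--;		/* drop the '/' terminator */
	} else {
		/* Short name: terminated by '/' or padded with spaces. */
		src = ar_name;
		end = memchr(src, '/', 16);
		if (end == NULL) {
			end = src + 16;
			while (end > src && end[-1] == ' ')
				end--;
		}
	}

	n = (size_t) (end - src);
	if (n + 1 > outlen)
		return (-1);
	(void) memcpy(out, src, n);
	out[n] = '\0';
	return (0);
}
```

In practice an application would simply use the ar_name member of the 
translated header returned by the library, as shown in the next section. 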

7.2 Example: Stepping through an ar(1) archive 

We now illustrate (listing 7.2) how an application may iterate through the 
members of an ar(1) archive. The steps involved are: 

1. Archives are opened using elf_begin in the usual way. 

2. Each archive managed by the libelf library tracks the next member to 
be opened. This information is updated using the functions elf_next and 
elf_rand. 

3. Nested calls to function elf_begin retrieve ELF descriptors for the 
members in the archive. 

Figure 7.2 pictorially depicts how functions elf_begin and 
elf_next are used to step through an ar(1) archive. 

We now look at an example program that illustrates these concepts. 




Figure 7.2: Iterating through ar(1) archives with elf_begin and elf_next 

Listing 7.2: Program 6 

/* 
 * Iterate through an ar(1) archive. 
 */ 

#include <err.h> 
#include <fcntl.h> 
#include <libelf.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <sysexits.h> 
#include <unistd.h> 

int 
main(int argc, char **argv) 
{ 
        int fd; 
        Elf *ar, *e; 
        Elf_Arhdr *arh; 

        if (argc != 2) 
                errx(EX_USAGE, "usage: %s file-name", argv[0]); 

        if (elf_version(EV_CURRENT) == EV_NONE) 
                errx(EX_SOFTWARE, "ELF library initialization " 
                    "failed: %s", elf_errmsg(-1)); 

        /* Open the archive for reading. */ 
        if ((fd = open(argv[1], O_RDONLY, 0)) < 0) 
                err(EX_NOINPUT, "open \"%s\" failed", argv[1]); 

        if ((ar = elf_begin(fd, ELF_C_READ, NULL)) == NULL) 
                errx(EX_SOFTWARE, "elf_begin() failed: %s.", 
                    elf_errmsg(-1)); 

        if (elf_kind(ar) != ELF_K_AR) 
                errx(EX_DATAERR, "%s is not an ar(1) archive.", 
                    argv[1]); 

        /* Each nested elf_begin() call returns the next member. */ 
        while ((e = elf_begin(fd, ELF_C_READ, ar)) != NULL) { 
                if ((arh = elf_getarhdr(e)) == NULL) 
                        errx(EX_SOFTWARE, "elf_getarhdr() failed: %s.", 
                            elf_errmsg(-1)); 

                (void) printf("%20s %zu\n", arh->ar_name, 
                    arh->ar_size); 

                (void) elf_next(e);     /* advance the parent archive */ 
                (void) elf_end(e);      /* release the member descriptor */ 
        } 

        (void) elf_end(ar); 
        (void) close(fd); 
        exit(0); 
} 


We open the ar(1) archive for reading and obtain a descriptor in the 
usual manner. 

Function elf_begin is used to iterate through the members of the 
archive. The third parameter in the call to elf_begin is a pointer to the 
descriptor for the archive itself. The return value of function elf_begin 
is a descriptor that references an archive member. 

We retrieve the translated ar(1) header using function elf_getarhdr. We 
then print out the name and size of the member. Note that function 
elf_getarhdr translates names to null-terminated C strings suitable for 
use with printf. 

Listing 7.3 shows the translated information returned by elf_getarhdr. 


Listing 7.3: The Elf_Arhdr Structure 

typedef struct { 
        time_t          ar_date;        /* time of creation */ 
        char            *ar_name;       /* archive member name */ 
        gid_t           ar_gid;         /* creator's group */ 
        mode_t          ar_mode;        /* file creation mode */ 
        char            *ar_rawname;    /* 'raw' member name */ 
        size_t          ar_size;        /* member size in bytes */ 
        uid_t           ar_uid;         /* creator's user id */ 
} Elf_Arhdr; 




The elf_next function sets up the parent archive descriptor (referenced by 
variable ar in this example) to return the next archive member on the 
next call to function elf_begin. 

It is good programming practice to call elf_end on descriptors that are no 
longer needed. 


Save the program in listing 7.2 to file prog6.c and then compile 
and run it as shown in listing 7.4. 


Listing 7.4: Compiling and Running prog6 

% cc -o prog6 prog6.c -lelf 
% ./prog6 /usr/lib/librt.a 
             timer.o 7552 
                mq.o 8980 
               aio.o 8212 
      sigev_thread.o 15528 


Compile and link the program in the usual fashion. 




We run the program against a small library and get a list of its members. 


7.2.1 Random access in an ar(1) archive 

Random access in the archive is supported by the function elf_rand. However, 
in order to use this function you need to know the file offsets in the archive for 
the desired archive member. For archives containing object files this information 
is present in the archive symbol table. 

If an archive has an archive symbol table, it can be retrieved using the 
function elf_getarsym. Function elf_getarsym returns an array of Elf_Arsym 
structures. Each Elf_Arsym structure (listing 7.5) maps one program symbol to 
the file offset inside the ar(1) archive of the member that contains its definition. 

Listing 7.5: The Elf_Arsym structure 

typedef struct { 
        off_t           as_off;         /* byte offset to member header */ 
        unsigned long   as_hash;        /* elf_hash() value for name */ 
        char            *as_name;       /* null terminated symbol name */ 
} Elf_Arsym; 

Once the file offset of the member is known, the function elf_rand can be 
used to set the parent archive to open the desired archive member at the next 
call to elf_begin. 

Chapter 8 

Conclusion 


This tutorial covered the following topics: 

• We gained an overview of the facilities for manipulating ELF objects 
offered by the ELF(3) and GELF(3) API sets. 

• We studied the basics of the ELF format, including the key data structures 
involved and their layout inside ELF objects. 

• We looked at example programs that retrieve ELF data structures from 
existing ELF objects. 

• We looked at how to create new ELF objects using the ELF(3) library. 

• We looked at accessing information in ar(1) archives. 

8.1 Further Reading 

8.1.1 On the Web 

Peter Seebach’s DeveloperWorks article “An unsung hero: The hardworking 
ELF” covers the history and features of the ELF format. Other tutorials include 
Hongjiu Liu’s “ELF: From The Programmer’s Perspective”, which covers GCC 
and GNU ld, and Michael L. Haung’s “The Executable and Linking Format 
(ELF)”. 

Neelakanth Nadgir’s tutorial on ELF(3) and GELF(3) is a readable and brief 
introduction to the ELF(3) and GELF(3) APIs for Solaris™. 

The Linkers and Libraries Guide from Sun Microsystems® describes linking 
and loading tools in Solaris™. Chapter 7 of this book, “Object File Format”, 
contains a readable introduction to the ELF format. 

8.1.2 More Example Programs 

The source code for the tools being developed at the ElfToolChain Project at 
SourceForge.Net shows the use of the ELF(3)/GELF(3) APIs in useful programs. 

For readers looking for smaller programs to study, Emmanuel Azencot offers 
a website with example programs. 

8.1.3 Books 

John Levine’s “Linkers and Loaders” is a readable book offering an overview of 
the process of linking and loading object files. 

8.1.4 Standards 

The current specification of the ELF format, the “Tool Interface Standard (TIS) 
Executable and Linking Format (ELF) Specification, Version 1.2”, is freely 
available to download. 


8.2 Getting Further Help 

If you have further questions about the use of libelf, please feel free to use 
our discussion list: elftoolchain-developers@lists.sourceforge.net. 